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Abstract: The estimation of the density matrix of a fc-level quantum system 
is studied when the parametrization is given by the real and imaginary part 
of the entries and they are estimated by independent measurements. It is 
established that the properties of the estimation procedure depend very much 
on the invertibility of the true state. In particular, in case of a pure state 
the estimation is less efficient. Moreover, several estimation schemes are 
compared for the unknown state of a qubit when one copy is measured at a 
time. It is shown that the average mean quadratic error matrix is the smallest 
if the applied observables are complementary. The results are illustrated by 
computer simulations. 
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1 Introduction 

The problem of inferring the state of a quantum system from measurement data is 
fundamental. One side of this problem is the adequate experimental techniques, and the 
other side is the theory based on the adaptation of statistics to the quantum mechanical 
formalism. 
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Most of the work in state estimation has focused on states of a qubit, pure states [5], 
or mixed states [TJ [HI [TOj - The estimation procedure for pure states is simpler, partially 
due to the smaller number of parameters. The subject of the present paper is state 
estimation for a fc-level quantum system. In this case the boundary of the state space 
is not the set of pure states but the non-invertible density matrices. The entries of 
the density matrix provide a natural parametrization of the state space. The accuracy 
of the estimation can be quantified by the fidelity or by the Hilbert-Schmidt distance. 
For larger matrices the latter seems to be easier to handle. When different estimation 
schemes are compared, the mean quadratic error matrix can be used. 



2 The estimation scheme 

The goal of state estimation is to determine the density operator p of a quantum system 
by measurements on n copies of the quantum system which are all prepared according 
to p El El The number n corresponds to the sample size in classical mathematical 
statistics. An estimation scheme means a measurement and an estimate for every n. For 
a reasonable scheme, we expect the estimation error to tend to when n tends to infinity 
as a consequence of the law of large numbers. 

Assume that p is the density matrix of our system described on the Hilbert space 
TC. Then the n identical copies are described by the ra-fold tensor product 7i n := 7i n ® 
and the state is p n := p n ®. When dim7i = k, we can identify the operators of H n with 
matrices of k n x k n . In this paper we study measurement schemes given by self-adjoint 
matrices 

A(n) = (A(n) y )J i=1 , (1) 

where A(n)y G B(TC n ). Note that A(n) is determined by k 2 self-adjoint operators acting 
on 7i n . They are the diagonal entries Z(n)u = A(n)u of A(n), moreover the off-diagonal 
entries are written as 

A(n)ij = X(n)ij + iY{n)ij (i < j) 

by means of self-adjoint X(n)ij and Y(n)ij. The measurement scheme A(n) means that 
the observables Z(n)u, X(n)ij and Y( measured on the r copies of the original 

system. Since the sum of the diagonal entries of a density matrix is 1, it is enough to 
measure k — 1 diagonal entries, for example, Z(n)kk can be removed from the set of 
observables to be measured and k 2 — 1 observables remain. (Hence n = r(k 2 — 1).) 

Example 1 Let k = 2 and 

S n {<Ti) = -{<Ti ®/®...®/ + /®(Ti®/®...®/ + ... + /®/®...®(T i )G B(H n ), 

n 

where 1 < % < 3 and crj are the Pauli matrices. Set 

if^-K J ™ + S n (ai) - iS n {a 2 ) \ . s 

A[H) ~ 2 1 Snfa) + \S n {a 2 ) I n - S n (a 3 ) J ' [Z) 
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where I n denotes the identity on H n . 

We may have a better understanding of this estimation scheme if the n-fold product 
is considered to be embedded into the infinite product. Then the limit n — > oo is more 
visible. If 

( Pn Pn\ 
V P21 P22 / 

then the law of large numbers guarantees that 

where 1^ denotes the identity in the infinite tensor product. Therefore, the error is going 
to when n goes to 00 for any reasonable definition of the error. 

The entries of the matrix (J2J) do not commute, therefore there is no joint Kolmogoro- 
vian model for them and the observables cannot be measured simultaneously. We modify 
this matrix, in order to use standard probabilistic tools. 

On the infinite tensor product £g> M k ® ... we introduce the right shift 7: 
j(H t ®H 2 ®...H n ®I k ®I k ...)=I k ®H 1 ®H 2 ®...H n ®I k ®I k ... 
Now we set 

k 1 / I r + S r (a 3 ) Y(S r (*l)) ~ h 2r (S r (a 2 )) 

W 2 V 7 r (^(^i)) + i7 2r (5 r (a 2 )) I r - S r fa) 

The operators S r (a 3 ), 7 r (S , r (o"i)) and 7 2r (5 , r (cr 2 )) commute. They may be regarded as 
classical random variables, one can speak about their joint distribution, variance etc. □ 




The very concrete estimation scheme we use will be the natural extension of Example 
Denote by Ey the k x k matrix units and set 

Z u := Y (i ' l) (E u ) (l<i<k), 
Xi d := ^(Eij + Eji) (i<j), 
Y id := ^(iEij-iEji) (i< ./). 

where r : : 1 < i,j < k,(i,j) 7^ (k, k)} — > {1, 2, . . . , k 2 — 1} is an arbitrary 

bijection. These self-adjoint operators commute and behave as independent random 
variables. The spectrum of Zu is {0, 1} and the spectrum of and is { — 1,0, 1}. 
The matrix A(k 2 — 1) is determined by these operator entries. 

Finally, the estimation scheme A(r(k 2 — 1)) is defined by the formulas 

1 r_1 

Z(r(k 2 - 1))« := -$> m(fc2-1) (^«) (1 < i < A;), 

m=0 

X(r(k 2 — l))y := if^" 11 ^) (i<j), 

m=0 
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Y(r(k 2 - l)) y := ^ 7 m(fca - 1) (^) (< < J - )- 

m=0 

Note that to carry on the measurement of all these observable r(k 2 — 1) copies of the orig- 
inal quantum system are needed. The entries of A(n) are commuting observables, there- 
fore there is a basis in H n such that all of them are diagonal in this basis. Consequently 
a single measurement can be performed theoretically instead of the measurements of the 
k 2 — 1 observables (n = r(k 2 — 1)). 

Our aim is to estimate the k x k density matrix p of a quantum system. The 
parametrization is naturally given by the entries of the matrix. In what follows, we 
are given several copies of a fc-level quantum system in the same state. We perform 
measurements on the systems one after another, that is, a system is measured only once, 
the next measurement is performed on the next copy of the system, so the states of the 
systems after the measurement are irrelevant from our viewpoint. 

If we want to estimate the real part of ij entry of the density matrix p, then we 
measure the observable Eij + E^. Its spectral decomposition is 

and its measurement has three different outcomes, ±1 and 0. The probabilities of the 
outcomes ±1 are 

Prob(X ii = ±1) = ^(pu ± p i: j ± pji + pjj) = ^(pu + pjj) ± Repij . 

To estimate the imaginary part, we measure iEij — iEji with spectral decomposition 

The probabilities are 

Prob(F ij = ±1) = -( Pii ± i Pij =p i Pji + Pjj ) = -( Pii + Pjj ) ± ilmp^ . 

Finally, for the diagonal ii entry we have 

Prob(Zii = 1) = pa. 

All the three kinds of measurement are performed r times. If M is one of the mea- 
surements which has outcome t, the we denote by u(r, M, t) the relative frequency of t 
when the measurement is performed r times. According to the law of large numbers, 
z/(r, M, t) — > Prob(M = t) as r — > oo. The following estimate is natural: 
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(i) $™(p)« = u(r, Z u , 1) for (1 < i < k) and 

fc-i 

*Hp)** = i-Z>(r,2«),, 

»=i 

(ii) Re$r(p)y = i(z/(r,X iJ; 1) - u(r,X ij: -1)) for i < j. 

(iii) Im $r(p)y = |(^(r, Fjj, 1) - u(r, % -1)) for i < j. 

In our notation, "im" is an abbreviation of the word " unconstrained 1 . It may happen 
that $^ n (p) is not a positive semidefinite matrix, hence it is not an estimate in the really 
strict sense. Let M.^ denote the set of all self-adjoint k x k matrices of trace 1. The 
estimate $5T takes its values in M. k . Note that the set of invertible density matrices 
form an open subset of M. k - 

Given a true state p, $™ is a matrix-(or vector-)valued random variable which is the 
mean of r independent copies of $Q n . Let G C M.^ be an open set such that p 6 G. 
According to the law of large numbers 

Prob($7 i G) -> 0, 

however according to the large deviation theorem the convergence is exponentially 
fast: 

Prob($r i G) < Cexp(-nE G ), 
where Eq > is the infimum of the so-called rate function, see 0]. 

Theorem 1 Assume that p is an invertible density matrix. The probability of that $" n 
is not a density matrix converges exponentially to as n — » oo. 

Proof. The expectation value of is p G M.k- Cramer's theorem tells us that there is 
a function / : M. k — > M + U {+00} such that for any open set containg p 

limsup - logProb($7 £ G) < - M{I(D) : D 6 M n \ G} 

n^oo n 

The RHS is strictly negative and if p is invertible, then we can choose G such that it 
consists of density matrices (that is, its elements are positive definite). This gives the 
proof. 

The computation of the rate function / is theoretically possible, but we do not need 
its concrete form. □ 

Although the expectation value of the unconstrained estimate is the true state, 
this does not mean that $™ is a good estimate. It may happen that the value of <3>^ n is 
outside of the state space with some probability. 



5 



Example 1 Consider the pure state 



= ^°o + °i)- ( 4 ) 

Then Zyy is a random variable r\\ such that Prob(?yi = 1) = Prob(?yi = 0) = 1/2, X 12 is 
1 (with probability 1) and Yyi is a random variable 772 such that Prob(r] 2 = ±1) = 1/2. 

One can compute that the expectation value of the determinant of $JJ n equals —3/2, 
independently of n. Therefore, in this example $" n is a rather bad estimate, for example, 
it is not true that the probability of indefinite estimate goes to as n — > 00, see also 
Figured □ 



P 



1 1 
1 1 



1.5 r 




Figure 1: The Hilbert-Schmidt distance between the true pure state (jH) and does 
not converge to as n —>■ 00. 

The properties of the unconstrained estimate $™ depend very much on the true state. 
If the eigenvalues of the true state are strictly positive (and not very small), then the 
estimate is rather good and the convergence is visible from the simulations, see Figure 
El and H3 The simulations are essentially simpler in the 2x2 case, when the boundary 
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of the state space consists of pure states and the positivity of the estimate can be seen 
from the length of the Bloch vector. In the 3x3 case the boundary is more complicated, 
it consists of the non-invertible densities. 
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Figure 2: The Hilbert-Schmidt distance between the true 3x3 state with eigenvalues 
0.1186, 0.2871, 0.5943 and the estimate. When the number of the measurement is more 
than 200, the unconstrained estimate gives really a positive semi definite matrix. 



3 Constrained estimate 

There are cases when <3>™ is not a positive semidefinite matrix, sometimes we call 
unconstrained estimate. The expectation value of is the true state of the system, 
so it is an unbiased estimate. 

We can use the method of least squares to get a density matrix: 

$ n := argmin.Tr ($7 - uf = argmin^ - , (5) 

where uj runs over the density matrices. The density matrices form a closed convex 
set Pfc, therefore the minimizer is unique. Note that for a qubit the closest positive 
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Figure 3: The fidelity between the true 2x2 mixed state and the estimate. When 
the number of the measurement is more than 10, the unconstrained and the constrained 
estimates are the same. 

semidefinite matrix is easy to find. If the values of the estimates are simply the Bloch 
vectors, then 

($™(x) if \\K n (x)\\ <1, 

®n(x) = I $un( x | (6) 

,, ° . . .. otherwise. 
I \\$™(x)\\ 

Theorem 2 The constrained estimate <3> n is asymptotically unbiased. 

Proof. We can use the fact that is unbiased and to show that $ n is an asymptot- 
ically unbiased estimate we study their difference. Let p(x) be the probability of the 
measurement result x and X is the set of outcomes such that ^ $ n ( x ), then 

evidently 

<C(*)p(z) - E ®n{x)p{x) = 53(*™(ar) - ® n (x))p(x) . (7) 

x x x£X 
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Figure 4: The fidelity between the true 2x2 pure state and the estimates. The uncon- 
strained estimate is often outside of the Bloch ball and in this case the (real part of the 
complex) fidelity can be bigger than 1. The constrained estimate converges to the true 
state. 



If Pfc C M.k is the set of density matrices, then X is the set of outcomes x such that 
$^ n (x) ^ T>k- Let us fix a norm on the space M. k - (Note that all norms are equivalent.) 

Let e > be arbitrary. We split X into two subsets: 

X x = {x G X : distance($7(x), V k ) < e} and X 2 = X \ X x . 
Note that distance (x), V k ) = \\&™(x) - $ ft (ar)||. Then 



\& n (x) ~ $ n (x)\\p(x) < ^ \\®n(x) ~ $ n (x)\\p(x) + ^ \\®n{x) ~ $ n (x)\\p(x) . 



xex 2 



The first term is majorized by e and the second one by CProb(Xi). Since the first is 
arbitrary small and the latter goes to 0, we can conclude that (jZj) goes to 0. □ 

Computing the constrained estimate. The computation of the minimizer of (0) 
is easier if <3>^ n is diagonal, assume that <£>^ n = Diag(xi, x 2 , ■ ■ ■ , x n ) and x x , x 2 , ■ ■ ■ , x k < 
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and Xk+i, Xk+2, ■ ■ ■ ,x n > 0. The minimizer is obviously diagonal, hence we need to solve 



argmm, 



^2( x i - yi) 2 



under the constraint yi > and YliVi = 1- According to the inequality between the 
quadratic and arithmetic means, we have 



are positive, then the minimizer is (yx, y2, ■ ■ ■ , y n ), where y\ — y 2 = . . . = y^ = and the 
other y^s are defined above. If the n-tuple (yi,y 2 , ■ ■ ■ , y n ) contains negative entries, then 
we repeat the procedure, the negative entries are replaced with and the actual value 
of c is added to the other entries. After finitely many steps we arrive at the minimizer. 
Figure El shows the details for n — 3. 

In the general case, we can change the basis such that becomes diagonal, since 
the Hilbert-Schmidt distance is invariant under this transformation. So let U^ n U* = 
Diag(xi, X2, • • • , x n ) for a unitary U . Then we compute the minimizer Diag(?/i, y 2 , . . . , y n ) 
using the above procedure and 



4 Estimations for a qubit 

The mean quadratic error matrix may be used to measure the efficiency of an estimate. If 
the unknown state is parametrized by 82, ■ ■ ■ , m ), then the mean quadratic error 
is an n x n matrix defined as 



In case of a qubit, the Bloch parametrization can be used. Then 9 = (61, 9 2 , 6*3)* belongs 
to the unit ball of R 3 . ((#1, 6 2 , 63)* means a column vector, so * may be regarded as the 
transpose.) 




If 




$ n = CTDiag(yi, y 2 , . . ■ , y n )U . 



Vn(0)ij := - i)(®n{x)j ~ Oj)p(x) (1 < i,j < Tl). 
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Figure 5: The constrained estimate for 3x3 matrices. The plain x + y + z= lof 
K 3 is shown. The triangle {(x,y,z) : x, y, z > 0} corresponds to the diagonal density 
matrices. Starting from the unconstrained estimate Diag(l/2, —1/2, 1), the constrained 
Diag(l/4, 0, 3/4) is reached in one step. Starting from Diag(l/6, — 1/2, 8/6), two steps 
are needed. 



Example 2 Assume that the observables 

A(i) = u(i) -a (1 < % < 3) 
are measured in the true state 



hr + e-a) = - 

2 V J 2 



i + e 3 e x - ie 2 
e l + \e 2 i - e 3 



where u(l), u(2) and u(3) are unit vectors in M 3 . The spectral decomposition of A(i) is 

l.^ + u(i)-<7) + (-l)i(/-u(i)-a) 



and 



Pi := Prob(A(i) = 1) 



i + u(i) ■ e 



If the measurements are performed r times, then ~Proh(A(i) = 1) is estimated by the 
relative frequency v{i\ of the outcome 1. The equations 



1 + u(i) ■ 9 
)r = V ; — (1 < i < 3) 
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should be solved to find an estimate. The solution is 

§ = T-\u(l) r ,u(2)rA^)ry 

where * denotes the transpose (of a row vector) and the matrix T is 



(9) 



T 



U(l)! U(l) 2 U(l) 3 

u(2)x u(2) 2 u(2) 3 
u(3)i u(3) 2 u(3) 3 



In particular, if each of the three measurements is performed once and the result is 
e = (£i,£ 2 ,£ 3 )*, then the unconstrained estimate is 



Similarly to Q, we have 



6 = T~ x p. 

The mean quadratic error matrix is the expectation of 

{T-h - 6){T- l £ - 9Y = T- 1 ((e -p)(e- pf) (T^ 1 )* 
and the computation yields 



(u(l)-^) 2 
1 






( U (2) • ey 

1 






( U (3) • ey 



When each measurement is performed r times, then 

r 

where n = 3r. If the observables <Ti, <j 2 and <r 3 are measured, then 



i -e\ o 

l-i 







1-9! 



(10) 



(12) 



□ 



Theorem 3 In the context of the previous example, the determinant of the average mean 
quadratic error matrix is the smallest, if the vectors u(l),u(2) and u(3) are orthogonal, 
that is, the observables A(l), A(2) and A(3) are complementary. 
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Proof. On the parameter space, Bloch ball, we consider the normalized Lebesgue mea- 
sure. (Any rotationally invariant measure may be considered and gives similar result.) 
Since 



JvJ 1 1 \9)d9 = T- 1 (/-|Diag((u(l)^) 2 ,( U (2)^) 2 ,(u(2)^) 2 )^)(T- 
= C^T)- 1 

with some positive constant C, the determinant is minimal if Det(T*T) = (DetT) 2 is 
maximal. Det T is the volume of the parallelepipedone determined by the three vectors 
u(l), u(2) and u(3), and it is maximal when they are orthogonal. □ 

The content of the theorem is similar to the result of [12], however in the approach 
of Wootters and Field not the mean quadratic error was minimized but the information 
gain was maximized. The complementary (or unbiased) measurements are optimal from 
both view point. 

Example 3 Let cr, = Pj — Qi be the spectral decomposition and let 

Fi = j and #+3 = ^ (l<i<3). 

be a POVM. The corresponding measurement is sometimes called standard qubit 
tomography ^H] and it has 6 outcomes with probabilities 

Pi = — ^— , Pi+3 = I 1 < 1 < 3)- 



The appropriate (unconstrained) state estimate 

= 1(1 + 3o-i) = -/ + 3P, $(^ + 3) = - 3<Ti) = -/ + 3Qi . 
is unbiased (1 < % < 3). If the true state is pe of (JHJ, then 

6 

The quadratic error matrix for n independent measurements is 



V° tand (9) = - 
n 



3 — 9 l —9\9i —9\9j, 
—9\92 3 — 9 2 — #2#3 
—9±9 3 —9 2 9 3 3 — 9 3 



(13) 



Proposition 1 In the context of the previous example, the complementary measurement 
is more efficient than the standard one, i.e. its mean quadratic error matrix is smaller. 
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Proof. To compare the efficiency of the standard measurement and the complementary 
measurement, we study the mean quadratic error matrices ()12j) and IjlHJ). The difference 
V* tand (6) - V n (9) has the form 



f 

n 



3 — Of —6162 
— 6162 3 — 62 
— #i#3 —0203 



—0 2 3 
3-9! 



3 
n 



1-Of 







— I 




1-91 








91 



1 


' 29\ 


—9\9 2 


"0103 " 


f 


"2 -1 -1 " 




" 0i " 


—9\9 2 


291 


—0203 


-f 2 -1 


■( 


02 


n 


—9i9 3 


—9 2 9 3 


20! 


n 


-f -1 2 




03 



[ 01 02 03 ] 



where o stands for the Hadamard product. Since the Hadamard product of two posi- 
tive semidefinite matrices is positive semidefinite, we have V^ tand (9) > V^ omp (9). The 
complementary measurement is more effective, than the standard one. □ 

Example 4 Consider the following Bloch vectors 



a i = 



<*2 = -^=(1,-1,-1) 



1 . , 1 

a 3 = —(-1,1,-1), a 4 = -^=(-1,-1,1). 

and form the positive operators 

Fi = -(<T + ai'<r) (l<i<4). 

They determine a measurement, Ylt=i ^ = I- The probability of the outcome i is 

1 



(14) 



Pi = Tr F it 



[1 + 04-9). 



The above POVM is called minimal qubit tomography by Rehacek, Englert and 
Kaszlikowski [IT)] . 

The matrix-valued estimator 



^rntn / 



-a + 6Fi (1 < t < 4). 



is unbiased. If the measurement is performed n times, then the average (written in 
vector-valued form) is 



n 



3 > — a 
^— ' n 



i=l 



(15) 



where is the number of the outcome % from the n measurements. The mean quadratic 
error matrix is 



v™ n (0) 
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3-0? ^303 - 0102 ^02-0103 

v/303 - 0102 3 - 01 v/30i - 2 3 

\/302 — 0103 V^301 — 0203 3 — 0| 



(16) 
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Unfortunately, the above matrix is not comparable with the mean quadratic error matrix 
()12j) . i.e. their difference is indefinite. However, TrV^ omp < r TiV^ im . 

□ 



5 Conclusion 

The estimation of the density matrix of a /c-level quantum system is studied in this 
paper. The essential ingredients of an estimation scheme are identified. Those are 
the parametrization of the density operator p, the observables to be measured, and 
the estimator mapping the measured values to an estimate of the density operator. 
The considered parametrization is given by the real and imaginary part of the entries, 
and they are estimated by independent measurements. A special set of commuting 
observables is defined in order to obtain measured values that are classical random 
variables. 

The unconstrained estimate gives a matrix which may be not positive definite and the 
constrained estimate is the closest density matrix with respect to the Hilbert-Schmidt 
distance. The constrained estimate is given by a simple procedure starting with the 
diagonalization of the unconstrained one. 

It is established that the properties of the estimation procedure depend very much on 
the invertibility of the true state. In case of an invertible true state, the unconstrained 
estimate becomes proper relatively fast. It has been found that for pure states the uncon- 
strained estimates, that are self-adjoint by construction, may not be positive semidefinite 
and this requires to apply a regularization called constrained estimation procedure. 

The estimation procedures carried out by different estimators are compared based on 
the biasedness of the estimates and their mean quadratic error matrices. In particular, 
several estimation schemes are compared for the unknown state of a qubit when a single 
qubit is measured at a time, and its density matrix is parametrized using the Bloch 
vector. It is shown that the average mean quadratic error matrix is the smallest if the 
applied observables are complementary. 

The results are illustrated by computer simulations. 
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